-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convenient utils for mpi/hierarchies #915
Conversation
📝 WalkthroughWalkthroughThe changes introduce two new functions, Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py (1)
43-70
: Consider extending MPI analysis capabilities.The new functions provide basic MPI rank analysis. Consider extending them with additional features to better support debugging and monitoring of patch distribution across ranks.
Suggested improvements:
- Add functions to analyze patch distribution statistics (min, max, average patches per rank).
- Add load balancing metrics to identify potential bottlenecks.
- Support time range analysis to track patch distribution changes over time.
- Consider creating a dedicated class to encapsulate these MPI analysis features.
Example structure:
@dataclass class MPIAnalysis: hierarchy: 'PatchHierarchy' def get_rank_count(self) -> int: """Wrapper for nbr_ranks""" pass def get_patch_distribution(self) -> Dict[float, Dict[int, int]]: """Wrapper for patch_per_rank""" pass def get_distribution_stats(self, time: Optional[float] = None) -> Dict[str, float]: """Calculate min, max, average patches per rank""" pass def get_load_balance_metrics(self, time: Optional[float] = None) -> Dict[str, float]: """Calculate load balancing metrics""" pass
def nbr_ranks(hier): | ||
""" | ||
returns the number of mpi ranks used in the given hierarchy | ||
""" | ||
max_rank = 0 | ||
t0 = hier.times()[0] | ||
for _, lvl in hier.levels(t0).items(): | ||
for patch in lvl.patches: | ||
rank = patch.attrs["mpi_rank"] | ||
if rank > max_rank: | ||
max_rank = rank | ||
return max_rank | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add input validation and type hints for better reliability.
The function should include input validation and type hints to improve reliability and maintainability.
Consider applying these improvements:
-def nbr_ranks(hier):
+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+ from .hierarchy import PatchHierarchy
+
+def nbr_ranks(hier: 'PatchHierarchy') -> int:
"""
returns the number of mpi ranks used in the given hierarchy
+
+ Args:
+ hier: The hierarchy to analyze
+
+ Returns:
+ int: The maximum MPI rank found in the hierarchy
+
+ Raises:
+ KeyError: If a patch is missing the 'mpi_rank' attribute
"""
+ if not hier or not hier.times():
+ return 0
+
max_rank = 0
t0 = hier.times()[0]
for _, lvl in hier.levels(t0).items():
for patch in lvl.patches:
+ if "mpi_rank" not in patch.attrs:
+ raise KeyError(f"Patch {patch.id} is missing 'mpi_rank' attribute")
rank = patch.attrs["mpi_rank"]
if rank > max_rank:
max_rank = rank
return max_rank
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def nbr_ranks(hier): | |
""" | |
returns the number of mpi ranks used in the given hierarchy | |
""" | |
max_rank = 0 | |
t0 = hier.times()[0] | |
for _, lvl in hier.levels(t0).items(): | |
for patch in lvl.patches: | |
rank = patch.attrs["mpi_rank"] | |
if rank > max_rank: | |
max_rank = rank | |
return max_rank | |
from typing import TYPE_CHECKING | |
if TYPE_CHECKING: | |
from .hierarchy import PatchHierarchy | |
def nbr_ranks(hier: 'PatchHierarchy') -> int: | |
""" | |
returns the number of mpi ranks used in the given hierarchy | |
Args: | |
hier: The hierarchy to analyze | |
Returns: | |
int: The maximum MPI rank found in the hierarchy | |
Raises: | |
KeyError: If a patch is missing the 'mpi_rank' attribute | |
""" | |
if not hier or not hier.times(): | |
return 0 | |
max_rank = 0 | |
t0 = hier.times()[0] | |
for _, lvl in hier.levels(t0).items(): | |
for patch in lvl.patches: | |
if "mpi_rank" not in patch.attrs: | |
raise KeyError(f"Patch {patch.id} is missing 'mpi_rank' attribute") | |
rank = patch.attrs["mpi_rank"] | |
if rank > max_rank: | |
max_rank = rank | |
return max_rank |
def patch_per_rank(hier): | ||
""" | ||
returns the number of patch per mpi rank for each time step | ||
""" | ||
nbranks = nbr_ranks(hier) | ||
ppr = {} | ||
for t in hier.times(): | ||
ppr[t] = {ir: 0 for ir in np.arange(nbranks + 1)} | ||
for _, lvl in hier.levels(t).items(): | ||
for patch in lvl.patches: | ||
ppr[t][patch.attrs["mpi_rank"]] += 1 | ||
|
||
return ppr | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance function with type hints, validation, and better documentation.
The function needs improvements in type hints, validation, and documentation. Also, the rank count initialization can be more efficient.
Consider applying these improvements:
-def patch_per_rank(hier):
+from typing import Dict, TYPE_CHECKING
+if TYPE_CHECKING:
+ from .hierarchy import PatchHierarchy
+
+def patch_per_rank(hier: 'PatchHierarchy') -> Dict[float, Dict[int, int]]:
"""
returns the number of patch per mpi rank for each time step
+
+ Args:
+ hier: The hierarchy to analyze
+
+ Returns:
+ Dict[float, Dict[int, int]]: A dictionary mapping timestamps to rank counts
+ where each rank count is a dictionary mapping rank IDs to patch counts
+
+ Example:
+ >>> ppr = patch_per_rank(hierarchy)
+ >>> print(ppr[0.0]) # Prints patch counts per rank at t=0.0
+ {0: 2, 1: 3, 2: 2} # Example output: rank 0 has 2 patches, rank 1 has 3, etc.
+
+ Raises:
+ KeyError: If a patch is missing the 'mpi_rank' attribute
"""
+ if not hier or not hier.times():
+ return {}
+
nbranks = nbr_ranks(hier)
ppr = {}
for t in hier.times():
- ppr[t] = {ir: 0 for ir in np.arange(nbranks + 1)}
+ ppr[t] = dict.fromkeys(range(nbranks + 1), 0)
for _, lvl in hier.levels(t).items():
for patch in lvl.patches:
+ if "mpi_rank" not in patch.attrs:
+ raise KeyError(f"Patch {patch.id} is missing 'mpi_rank' attribute")
ppr[t][patch.attrs["mpi_rank"]] += 1
return ppr
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def patch_per_rank(hier): | |
""" | |
returns the number of patch per mpi rank for each time step | |
""" | |
nbranks = nbr_ranks(hier) | |
ppr = {} | |
for t in hier.times(): | |
ppr[t] = {ir: 0 for ir in np.arange(nbranks + 1)} | |
for _, lvl in hier.levels(t).items(): | |
for patch in lvl.patches: | |
ppr[t][patch.attrs["mpi_rank"]] += 1 | |
return ppr | |
from typing import Dict, TYPE_CHECKING | |
if TYPE_CHECKING: | |
from .hierarchy import PatchHierarchy | |
def patch_per_rank(hier: 'PatchHierarchy') -> Dict[float, Dict[int, int]]: | |
""" | |
returns the number of patch per mpi rank for each time step | |
Args: | |
hier: The hierarchy to analyze | |
Returns: | |
Dict[float, Dict[int, int]]: A dictionary mapping timestamps to rank counts | |
where each rank count is a dictionary mapping rank IDs to patch counts | |
Example: | |
>>> ppr = patch_per_rank(hierarchy) | |
>>> print(ppr[0.0]) # Prints patch counts per rank at t=0.0 | |
{0: 2, 1: 3, 2: 2} # Example output: rank 0 has 2 patches, rank 1 has 3, etc. | |
Raises: | |
KeyError: If a patch is missing the 'mpi_rank' attribute | |
""" | |
if not hier or not hier.times(): | |
return {} | |
nbranks = nbr_ranks(hier) | |
ppr = {} | |
for t in hier.times(): | |
ppr[t] = dict.fromkeys(range(nbranks + 1), 0) | |
for _, lvl in hier.levels(t).items(): | |
for patch in lvl.patches: | |
if "mpi_rank" not in patch.attrs: | |
raise KeyError(f"Patch {patch.id} is missing 'mpi_rank' attribute") | |
ppr[t][patch.attrs["mpi_rank"]] += 1 | |
return ppr |
I've killed some builds for this as they have, from what I can see, no way to test these changes |
Summary by CodeRabbit